Pattern recognition in genetic sequences.
Abstract
This paper announces an algorithm for finding pattern similarities between two given finite sequences. Two portions, one from each sequence, are similar if they are close in the metric space of evolutionary distances. In its most general form the algorithm allows a complete list to be made of all pairs of intervals, one from each of the two given sequences, such that each pair displays a maximum local degree of similarity; if the lengths of the sequences are m and n, then the algorithm requires on the order of mn steps. This result lends itself to detecting similarities by computer between pairs of biological sequences, such as proteins and nucleic acids.
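The order-mn cost quoted above is typical of dynamic programming over an m-by-n table. The following is a minimal sketch of that general idea, not the paper's algorithm: it assumes unit costs for substitution, insertion, and deletion as a stand-in for the evolutionary distance, and it only reports, for each end position in one sequence, the distance from the other sequence to its closest portion ending there. The function name `best_match_distances` is hypothetical.

```python
def best_match_distances(pattern, text):
    """For each end position j in `text`, return the minimum unit-cost edit
    distance between `pattern` and any portion of `text` ending at j.

    Illustrative only: unit costs are an assumption, not the paper's metric.
    Runs in O(len(pattern) * len(text)) steps.
    """
    m = len(pattern)
    # Column for the empty text prefix: deleting i pattern symbols costs i.
    prev = list(range(m + 1))
    result = []
    for ch in text:
        # curr[0] = 0 because a matching portion may start anywhere in text.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitute
                prev[i] + 1,         # extra symbol in text
                curr[i - 1] + 1,     # extra symbol in pattern
            )
        result.append(curr[m])
        prev = curr
    return result


if __name__ == "__main__":
    # Positions where the value is locally minimal mark candidate similar portions.
    print(best_match_distances("ACG", "TTACGTT"))
```

Listing the full set of interval pairs with locally maximal similarity, as the abstract describes, would need further bookkeeping that this sketch does not attempt.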
Similar resources
An Evolutionary and Phylogenetic Study of the BMP15 Gene
DNA sequence data contains a wealth of biologically useful information. Recent innovations in DNA sequencing technology have greatly increased our capacity to determine massive amounts of nucleotide sequences. These sequences can be used to specify the characteristics of different regions, interpret the evolutionary relationships between categorized groups, likelihood of performing multiple com...
Identification and Functional Prediction of Long Non-Coding RNAs Responsive to Drought stress in Lens culinaris L.
Drought stress is one of the main environmental factors that affect the growth and productivity of crop plants, including lentil. In the course of evolution, crucial genetic regulations mediated by non-coding RNAs (ncRNAs) have emerged in plants in response to drought and other abiotic stresses. In the present study, after identifying lncRNAs within the expression profile of lentil, RNA-s...
Finding Exact and Solo LTR-Retrotransposons in Biological Sequences Using SVM
Finding repetitive subsequences in a genome is a challenging problem in bioinformatics research. Many approaches have been proposed to solve the problem, which can be divided into library-based and de novo methods. Library-based methods use predetermined repetitive genome subsequences, whereas library-less methods attempt to discover repetitive subsequences by analytical approach...
Bioinformatics Study and Investigation of the Expression Pattern of Several Important Genes Involved in Glycyrrhizin Synthesis of Glycyrrhiza glabra L. in Autumn and Spring Seasons
Glycyrrhiza is an important medicinal plant that is in danger of extinction. Finding accessions with a higher glycyrrhizic acid content is very important in breeding programs. Functional genomics methods such as EST sequencing make it possible to identify consensus gene families among the studied species and to interpret the genome. In this research, 55960 EST sequences of ...
Evolutionary computation method for pattern recognition of cis-acting sites.
This paper develops an evolutionary method that learns inductively to recognize the makeup and the position of very short consensus sequences, cis-acting sites, which are a typical feature of promoters in genomes. The method combines a finite state automaton (FSA) and genetic programming (GP) to discover candidate promoter sequences in primary sequence data. An experiment measures the success of...
Identification of Rare Genetic Disorder from Single Nucleotide Variants Using Supervised Learning Technique
Muscular dystrophy is a rare genetic disorder of the muscular system that deteriorates the skeletal muscles and hinders locomotion. Genetic disorders such as muscular dystrophy are identified from mutations in the gene sequence. A new model is proposed for classifying the disease accura...
Journal title:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 76, Issue 7
Pages -
Publication date 1979